RECEIVED 

APR 24 2002 
TECH CENTER 1600/2900 



RAW SEQUENCE LISTING 

PATENT APPLICATION: US/09/843,250 




DATE: 03/04/2002 
TIME: 15:02:39 



Input Set : A:\09-843250 Sequence Listing.txt 
Output Set: N:\CRF3\03042002\I843250.raw 

4 <110> APPLICANT: Parales, R. 

5 Gibson, D. 

6 Resnick, S. 

7 Lee, K. 

9 <120> TITLE OF INVENTION: Novel naphthalene dioxygenase and methods 
10 for their use 

11 <130> FILE REFERENCE: 875.006US2 

13 <140> CURRENT APPLICATION NUMBER: US 09/843,250 

14 <141> CURRENT FILING DATE: 2001-04-26 

16 <150> PRIOR APPLICATION NUMBER: PCT/US99/25079 

17 <151> PRIOR FILING DATE: 1999-10-26 

19 <150> PRIOR APPLICATION NUMBER: US 60/105,575 

20 <151> PRIOR FILING DATE: 1998-10-26 
22 <160> NUMBER OF SEQ ID NOS: 65 

24 <170> SOFTWARE: FastSEQ for Windows Version 4.0 

26 <210> SEQ ID NO: 1 

27 <211> LENGTH: 2265 

28 <212> TYPE: DNA 

29 <213> ORGANISM: Artificial Sequence 

31 <220> FEATURE: 

32 <223> OTHER INFORMATION: A sequence encoding an NDO mutant. 

34 <400> SEQUENCE: 1 

35 gagggtagag aaatcgaatg ccccttgcat caaggtcggt ttgacgtttg cacaggcaaa 

36 gccctgtgcg cacccgtgac acagaacatc aaaacatatc cagtcaagat tgagaacctg 

37 cgcgtaatga ttgatttgag ctaagaattt taacaggagg caccccgggc cctagagcgt 

38 aatcaccccc attccatctt ttttaggtga aaacatgaat tacaataata aaatcttggt 

39 aagtgaatct ggtctgagcc aaaagcacct gattcatggc gatgaagaac ttttccaaca 

40 tgaactgaaa accatttttg cgcggaactg gctttttctc actcatgata gcctgattcc 

41 tgcccccggc gactatgtta ccgcaaaaat ggggattgac gaggtcatcg tctcccggca 

42 gaacgacggt tcgattcgtg cttttctgaa cgtttgccgg catcgtggca agacgctggt 
43 gagcgtggaa gccggcaatg ccaaaggttt tgtttgcagc tatcacggct ggggcttcgg 
44 ctccaacggt gaactgcaga gcgttccatt tgaaaaagat ctgtacggcg agtcgctcaa 
45 taaaaaatgt ctggggttga aagaagtcgc tcgcgtggag agcttccatg gcttcatcta 
46 cggttgcttc gaccaggagg cccctcctct tatggactat ctgggtgacg ctgcttggta 
47 cctggaacct atgttcaagc attccggcgg tttagaactg gtcggtcctc caggcaaggt 
48 tgtgatcaag gccaactgga aggcacccgc ggaaaacttt gtgggagatg cataccacgt 
49 gggttggacg cacgcgtctt cgcttcgctc gggggagtct atcttctcgt cgctcgctgg 

50 caatgcggcg ctaccacctg aaggcgcagg cttgcaaatg acctccaaat acggcagcgg 

51 catgggtgtg ttgtgggacg gatattcagg tgtgcatagc gcagacttgg ttccggaatt 

52 gatggcattc ggaggcgcaa agcaggaaag gctgaacaaa gaaattggcg atgttcgcgc 

53 tcggatttat cgcagccacc tcaactgcac cgttttcccg aacaacagca tgctgacctg 

54 ctcgggtgtt ttcaaagtat ggaacccgat cgacgcaaac accaccgagg tctggaccta 

55 cgccattgtc gaaaaagaca tgcctgagga tctcaagcgc cgcttggccg actctgttca 

56 gcgaacggtc gggcctgctg gcttctggga aagcgacgac aatgacaata tggaaacagc 



for their use 
57 ttcgcaaaac ggcaagaaat atcaatcaag agatagtgat ctgctttcaa accttggttt 1380 

58 cggtgaggac gtatacggcg acgcggtcta tccaggcgtc gtcggcaaat cggcgatcgg 1440 

59 cgagaccagt tatcgtggtt tctaccgggc ttaccaggca cacgtcagca gctccaactg 1500 
60 ggctgagttc gagcatgcct ctagtacttg gcatactgaa cttacgaaga ctactgatcg 1560 

61 ctaacagacg agtcgaccat gatgatcaat attcaagaag acaagctggt ttccgcccac 1620 

62 gacgccgaag agattcttcg tttcttcaat tgccacgact ctgctttgca acaagaagcc 1680 

63 actacgctgc tgacccagga agcgcatttg ttggacattc aggcttaccg tgcttggtta 1740 

64 gagcactgcg tggggtcaga ggtgcaatat caggtcattt cacgcgaact gcgcgcagct 1800 

65 tcagagcgtc gttataagct caatgaagcc atgaacgttt acaacgaaaa ttttcagcaa 1860 

66 ctgaaagttc gagttgagca tcaactggat ccgcaaaact ggggcaacag cccgaagctg 1920 

67 cgctttactc gctttatcac caacgtccag gccgcaatgg acgtaaatga caaagagcta 1980 

68 cttcacatcc gctccaacgt cattctgcac cgggcacgac gtggcaatca ggtcgatgtc 2040 

69 ttctacgccg cccgggaaga taaatggaaa cgtggcgaag gtggagtacg aaaattggtc 2100 

70 cagcgattcg tcgattaccc agagcgcata cttcagacgc acaatctgat ggtctttctg 2160 

71 tgattcagtg accattttta caaatggtca ctgcaaccgc ggtcaccatt aatcaaaggg 2220 

72 aatgtacgtg tatgggcaat caacaagtcg tttcgataac cggtg 2265 

74 <210> SEQ ID NO: 2 

75 <211> LENGTH: 449 

76 <212> TYPE: PRT 

77 <213> ORGANISM: Artificial Sequence 

79 <220> FEATURE: 

80 <223> OTHER INFORMATION: A polypeptide encoded by SEQ ID NO: 1 

82 <400> SEQUENCE: 2 

83 Met Asn Tyr Asn Asn Lys Ile Leu Val Ser Glu Ser Gly Leu Ser Gln 

84 1 5 10 15 

85 Lys His Leu Ile His Gly Asp Glu Glu Leu Phe Gln His Glu Leu Lys 

86 20 25 30 

87 Thr Ile Phe Ala Arg Asn Trp Leu Phe Leu Thr His Asp Ser Leu Ile 

88 35 40 45 

89 Pro Ala Pro Gly Asp Tyr Val Thr Ala Lys Met Gly Ile Asp Glu Val 

90 50 55 60 

91 Ile Val Ser Arg Gln Asn Asp Gly Ser Ile Arg Ala Phe Leu Asn Val 

92 65 70 75 80 

93 Cys Arg His Arg Gly Lys Thr Leu Val Ser Val Glu Ala Gly Asn Ala 

94 85 90 95 

95 Lys Gly Phe Val Cys Ser Tyr His Gly Trp Gly Phe Gly Ser Asn Gly 

96 100 105 110 

97 Glu Leu Gln Ser Val Pro Phe Glu Lys Asp Leu Tyr Gly Glu Ser Leu 

98 115 120 125 

99 Asn Lys Lys Cys Leu Gly Leu Lys Glu Val Ala Arg Val Glu Ser Phe 

100 130 135 140 

101 His Gly Phe Ile Tyr Gly Cys Phe Asp Gln Glu Ala Pro Pro Leu Met 

102 145 150 155 160 

103 Asp Tyr Leu Gly Asp Ala Ala Trp Tyr Leu Glu Pro Met Phe Lys His 

104 165 170 175 

105 Ser Gly Gly Leu Glu Leu Val Gly Pro Pro Gly Lys Val Val Ile Lys 

106 180 185 190 

107 Ala Asn Trp Lys Ala Pro Ala Glu Asn Phe Val Gly Asp Ala Tyr His 

108 195 200 205 
139 Arg 

142 <210> SEQ ID NO: 3 

143 <211> LENGTH: 9841 

144 <212> TYPE: DNA 

145 <213> ORGANISM: Artificial Sequence 

147 <220> FEATURE: 

148 <223> OTHER INFORMATION: A modified DNA molecule encoding valine at the 

149 position corresponding to the F352 amino acid in 

150 NDO. 

152 <400> SEQUENCE: 3 

153 gaattcatca ggaagacatt caaatgaacg taaacaataa gggcagcgtc tgtatttgcg 60 

154 gcagcgaaat gctccctaaa ttcctcattt accccatctg aggattgctt tatgacagta 120 

155 aagtggattg aagcagtcgc tctttctgac atccttgaag gtgacgtcct cggcgtgact 180 

156 gtcgagggca aggagctggc gctgtatgaa gttgaaggcg aaatctacgc taccgacaac 240 

157 ctgtgcacgc atggttccgc ccgcatgagt gatggttatc tcgagggtag agaaatcgaa 300 

158 tgccccttgc atcaaggtcg gtttgacgtt tgcacaggca aagccctgtg cgcacccgtg 360 

159 acacagaaca tcaaaacata tccagtcaag attgagaacc tgcgcgtaat gattgatttg 420 

160 agctaagaat tttaacagga ggcaccccgg gccctagagc gtaatcaccc ccattccatc 480 

161 ttttttaggt gaaaacatga attacaataa taaaatcttg gtaagtgaat ctggtctgag 540 
162 ccaaaagcac ctgattcatg gcgatgaaga acttttccaa catgaactga aaaccatttt 
163 tgcgcggaac tggctttttc tcactcatga tagcctgatt cctgcccccg gcgactatgt 
164 taccgcaaaa atggggattg acgaggtcat cgtctcccgg cagaacgacg gttcgattcg 
165 tgcttttctg aacgtttgcc ggcatcgtgg caagacgctg gtgagcgtgg aagccggcaa 
166 tgccaaaggt tttgtttgca gctatcacgg ctggggcttc ggctccaacg gtgaactgca 
167 gagcgttcca tttgaaaaag atctgtacgg cgagtcgctc aataaaaaat gtctggggtt 
168 gaaagaagtc gctcgcgtgg agagcttcca tggcttcatc tacggttgct tcgaccagga 
169 ggcccctcct cttatggact atctgggtga cgctgcttgg tacctggaac ctatgttcaa 
170 gcattccggc ggtttagaac tggtcggtcc tccaggcaag gttgtgatca aggccaactg 
171 gaaggcaccc gcggaaaact ttgtgggaga tgcataccac gtgggttgga cgcacgcgtc 
172 ttcgcttcgc tcgggggagt ctatcttctc gtcgctcgct ggcaatgcgg cgctaccacc 
173 tgaaggcgca ggcttgcaaa tgacctccaa atacggcagc ggcatgggtg tgttgtggga 
174 cggatattca ggtgtgcata gcgcagactt ggttccggaa ttgatggcat tcggaggcgc 
175 aaagcaggaa aggctgaaca aagaaattgg cgatgttcgc gctcggattt atcgcagcca 
176 cctcaactgc accgttttcc cgaacaacag catgctgacc tgctcgggtg ttttcaaagt 
177 atggaacccg atcgacgcaa acaccaccga ggtctggacc tacgccattg tcgaaaaaga 
178 catgcctgag gatctcaagc gccgcttggc cgactctgtt cagcgaacgg tcgggcctgc 
179 tggcttctgg gaaagcgacg acaatgacaa tatggaaaca gcttcgcaaa acggcaagaa 
180 atatcaatca agagatagtg atctgctttc aaaccttggt ttcggtgagg acgtatacgg 
181 cgacgcggtc tatccaggcg tcgtcggcaa atcggcgatc ggcgagacca gttatcgtgg 
182 tttctaccgg gcttaccagg cacacgtcag cagctccaac tgggctgagt tcgagcatgc 
183 ctctagtact tggcatactg aacttacgaa gactactgat cgctaacaga cgagtcgacc 
184 atgatgatca atattcaaga agacaagctg gtttccgccc acgacgccga agagattctt 
185 cgtttcttca attgccacga ctctgctttg caacaagaag ccactacgct gctgacccag 
186 gaagcgcatt tgttggacat tcaggcttac cgtgcttggt tagagcactg cgtggggtca 
187 gaggtgcaat atcaggtcat ttcacgcgaa ctgcgcgcag cttcagagcg tcgttataag 
188 ctcaatgaag ccatgaacgt ttacaacgaa aattttcagc aactgaaagt tcgagttgag 
189 catcaactgg atccgcaaaa ctggggcaac agcccgaagc tgcgctttac tcgctttatc 
190 accaacgtcc aggccgcaat ggacgtaaat gacaaagagc tacttcacat ccgctccaac 
191 gtcattctgc accgggcacg acgtggcaat caggtcgatg tcttctacgc cgcccgggaa 
192 gataaatgga aacgtggcga aggtggagta cgaaaattgg tccagcgatt cgtcgattac 
193 ccagagcgca tacttcagac gcacaatctg atggtctttc tgtgattcag tgaccatttt 
194 tacaaatggt cactgcaacc gcggtcacca ttaatcaaag ggaatgtacg tgtatgggca 
195 atcaacaagt cgtttcgata accggtgcag gctcaggaat cggtctcgaa ctggttcggt 
196 cctttaagtc ggccggttat tacgtatccg ctctcgtacg aaacgaggag caagaggcgc 
197 ttctttgcaa agagttcaag gacgcactcg agattgtagt gggcgatgtc cgggaccacg 
198 caacaaatga gaagctgata aagcaaacaa tcgatagatt cggtcatctt gattgtttta 
199 ttgcaaatgc cggtatctgg gattacatgc tgagcatcga agagccttgg gagaaaatat 
200 cgagcagttt tgacgaaata ttcgacatta atgtcaagag ctatttcagt ggcatcagtg 
201 ccgccctgcc ggaactgaaa aagactaacg gatcagtggt gatgaccgct tcggtgtcgt 
202 cccatgcggt cggtggtggt ggttcttgct acatcgccag caagcatgcg gtgctcggta 
203 tggttaaggc tttggcctac gaattggccc ccgaagttcg cgtgaacgct gtttcgccgg 
204 ggggcaccgt gacgtctctg tgcggtcccg cgagcgccgg tttcgacaaa atgcacatga 
205 aagacatgcc cggcatcgac gatatgatca aaggtctcac gcctcttggg tttgcagcca 
206 agcccgaaga cgtggtggca ccctatttgt tgctggcttc gcgaaagcaa ggaaaattca 
207 tcaccggcac cgtgattagc attgatggcg gtatggcgct cggtcgcaag tgagcttgta 
208 gccgatcaga agttatagac acatttcagg tgacgcccca tgaagacaaa actgtttatc 
209 aataacgcct ggatcgattc tagtgaccag cagaccttcg agcgcataca ccccgtcagc 
210 agcgatgtgg tgactgagag cgcaaacgcc acagtgacgg acgcgataaa ggcggcgcaa 
211 gcggccgagg aggcgttcaa gacctggaag gccgttggac cttcagagcg 
212 ctcctaaagg tcgccgatgt catggaaagt aaaacaccca agttcatcga 
213 atggaggtgg gagcttccgc cctttgggcc ggattcaacg tccatgcgtc 
214 ttccgagagg ctgcctcgct ggctacccaa attcagggtg aaaccatccc 
215 gccgaaacgc tctcaatgac actacgtcag ccggtcggcc cgatcctaag 
216 tggaacggca ccgcagtgct tgcggcacga gccatcgctt atccgctggt 
217 actgtggtgt tcaaaggctc tgaatttagt cccgcgacgc atgccctgat 
218 gtgcaggaag ccgggctgcc cgctggcgtg ctcaattacc tcaactcttc 
219 tcgcccgaga tcgctgacgc actgatctct gccaaggaga tccgccgcat 
220 ggttccaccc gcgtgggcag cattatcgcg cagaaagccg cgcaacacct 
221 ctgctggagc tcggcggcaa gtccccgctt attgttctgg atgatgcaga 
222 gcggtcaagg cagcggtgtt cggtagcttc ctgttccaag gtcagatctg 
223 gagcgcttga tcgttgatga gaagatagcc gacgaatttg tcgcaaaatt 
224 actaagcgct tgagcgcagg cgacccgtgc gtaactggcg actgcatcat 
225 gtctcgccaa attcgggtga gcggatcaat ggtttgttca aagacgcgat 
226 gcaaaagttg tttgcggcgg cttggcccaa ggtgcgctca tgccggccac 
227 cacgtcaaat ctgacatgcg gatttacgat gaggagacct ttggtcccat 
228 atccgttgta aaggcgaagc agaggccgtc cgcattgcca acgacagcgt 
229 tcgtcgggcg tatttggccg cgacatcaac cgcgctctac gcgtgggtat 
230 tatggttctg tacacatcaa cggttcgacc gtccagaacg aggcgcaggc 
231 ggcaccaaga acaccggcta cgggcgcttc gacggccgtg ctgtaatcga 
232 gagatcaagt ggctgaccat cgaacctttc gagcagcaat atcccttctg 
233 actcccagga atcaaactat gagtaagcaa gctgcagtta tcgagctcgg 
234 atctcggtca aggaccctga tgcgtggaaa tcatttgcca cggatatgct 
235 gttcttgatg agggtgagaa ggaccgtttc tatctgcgga tggattactg 
236 atcgtagtcc atcacaacgg acaggacgac ttggagtacc taggctggcg 
237 aagccggagt tcgaagctct gggtcaaaag cttattgatg ccggttacaa 
238 tgcgacaaag ttgaggctca ggagcgtatg gtgttgggtc tgatgaagac 
239 ggcggcaacc cgaccgagat attctggggc ccccggatcg acatgagcaa 
240 cccggtcgcc ccctgcacgg aaagtttgtg accggtgacc aaggcttggg 
241 gttcgccaaa ccgacgtcgc agaagctcat aagttttata gcctgctggg 
242 gacgtcgaat accggattcc gttgcccaac ggcatgactg ccgaactgtc 
243 tgcaacgccc gtgatcactc cattgctttt ggtgccatgc ccgctgccaa 
244 cacttgatgc ttgagtacac ccatatggaa gacttgggat acacgcacca 
245 aagaacgaaa ttgacattgc cttgcagctt ggcattcacg ccaacgacaa 
246 ttctatggtg caacgccttc gggctggctc attgagcccg gctggcgagg 
247 atagatgaag cggagtatta cgtcggcgac atcttcggcc atggcgtgga 
248 tatggcctgg atgtaaaact gagctaaaga tgcgcgctcg ttgggcgagg 
249 gcatcttcat acgcaaccaa ccttgcaggg cgatgagatc aaaggacgtt 
250 ggaagtggtt cgggccatgc gcataccgat ccatgacatt tgtttcatag 
251 gataggtgaa tcaagcgctt agtcaactag tggacacatc tgttccatga 
252 tatctattca aaacaagaat aataaatagg atgaaaataa taatgataaa 
253 tgtcttgtgt atcctctatt ctgtttggca agccccacat gggccgaaga 
254 acgtaccgta ttggtatgac taatgtagct ttcgatgcta gcgcaaaagt 
255 ggtcagcggg tgccaggagg aagcgctgat gcgagcgata acaacgcgct 
256 ttcggctacg ccatcaacga ccagtggaat gtacgtgcga ttgtcggtat 
257 actaaagtga cgggcgcagg cacacttcct ggtatccagc tggggaaaat 
258 ccaacagtat taacgttgaa ctataacctc cccgctttgg gtcccgttcg 
259 ggtgcgggag tcaattacac gcggattttt gaaagtcggg acgctaatct 
Use of n and/or Xaa has been detected in the 
Sequence Listing. Review the Sequence Listing 
to insure a corresponding explanation is present 
in the <220> to <223> fields of each sequence 
using n or Xaa. 
VERIFICATION SUMMARY DATE: 03/04/2002 

PATENT APPLICATION: US/09/843,250 TIME: 15:02:40 

Input Set : A:\09-843250 Sequence Listing.txt 
Output Set: N:\CRF3\03042002\I843250.raw 

L:569 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:6 
L:2069 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:19 
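
The review that these warnings call for can be roughed out in code. The following is only a minimal sketch, not the checker that produced this summary: it assumes a listing laid out like the one above, with each record opened by a `<210>` field and its residues following `<400>`, and it reports SEQ ID numbers that use "n" or "Xaa" while having no text in a `<223>` field. The function name `sequences_missing_explanation` and the `SAMPLE` text are hypothetical, introduced only for illustration.

```python
import re


def sequences_missing_explanation(listing: str) -> list[int]:
    """Return SEQ ID numbers that use n/Xaa without any <223> text."""
    flagged = []
    # One record per <210> identifier; the lookahead keeps the tag in the record.
    for record in re.split(r"(?=<210>)", listing):
        header = re.search(r"<210>\s*(?:SEQ ID NO:\s*)?(\d+)", record)
        if header is None or "<400>" not in record:
            continue
        fields, residues = record.split("<400>", 1)
        # Nucleotide rows are all-lowercase words; residue names such as "Asn"
        # start with a capital, so they are not mistaken for an unknown base.
        uses_n = any("n" in word for word in re.findall(r"\b[a-z]+\b", residues))
        uses_xaa = re.search(r"\bXaa\b", residues) is not None
        # Whatever follows <223>, minus the field label, counts as explanation.
        note = re.search(r"<223>(.*)", fields, re.DOTALL)
        explained = bool(
            note and note.group(1).replace("OTHER INFORMATION:", "").strip()
        )
        if (uses_n or uses_xaa) and not explained:
            flagged.append(int(header.group(1)))
    return flagged


# Hypothetical three-record listing: only record 2 uses "n" with no <223> note.
SAMPLE = """<210> SEQ ID NO: 1
<212> TYPE: DNA
<220> FEATURE:
<223> OTHER INFORMATION: n is a, c, g, or t
<400> SEQUENCE: 1
acgtnacgta 10
<210> SEQ ID NO: 2
<212> TYPE: DNA
<400> SEQUENCE: 2
acgtnnacgt 10
<210> SEQ ID NO: 3
<212> TYPE: PRT
<400> SEQUENCE: 3
Met Asn Gln
1
"""

print(sequences_missing_explanation(SAMPLE))
```

A sketch like this only tests whether some `<223>` text exists; whether that text actually explains each "n" or "Xaa" position still needs a manual read of the flagged records.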